Probabilistic Clustering Using Hierarchical Models

نویسندگان

  • Igor Cadez
  • Padhraic Smyth
چکیده

This paper addresses the problem of clustering data when the available data measurements are not multivariate vectors of xed dimensionality. For example, one might have data from a set of medical patients, where for each patient there are time series, image, text, and multivariate data. We propose a general probabilistic clustering framework for clustering heterogeneous data types of this form. We focus on two-level probabilistic hierarchical models, consisting of a high-level mixture model on parameters and a low-level model for observations. This general framework permits probabilistic clustering of \objects" (sequences, histograms, images, etc) using an extension of the expectation-maximization (EM) algorithm which we derive. We further show that earlier (intuitive) clustering algorithms can be viewed as special cases (approximations) of the framework proposed here. The paper includes several illustrations of the method, including an application to a problem in clustering two-dimensional histograms of red blood cell data in a medical diagnosis context. Time (30−second intervals) Lo cu st Id en di ty 20 40 60 80 100 120 140 160 5 10 15 20 Figure 1: Binary activity data from 24 locusts as a function of time (white indicates inactive, black indicates active. The odd numbered (fed) locusts are less active (white) than the even numbered (unfed) locusts.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Probabilistic Hierarchical Clustering Method for Organising Collections of Text Documents

In this paper a generic probabilistic framework for the unsupervised hierarchical clustering of large-scale sparse high-dimensional data collections is proposed. The framework is based on a hierarchical probabilistic mixture methodology. Two classes of models emerge from the analysis and these have been termed as symmetric and asymmetric models. For text data specifically both asymmetric and sy...

متن کامل

Probabilistic Hierarchical Clustering Method for Organizing Collections of Text Documents

In this paper a generic probabilistic framework for the unsupervised hierarchical clustering of large-scale sparse high-dimensional data collections is proposed. The framework is based on a hierarchical probabilistic mixture methodology. Two classes of models emerge from the analysis and these have been termed as symmetric and asymmetric models. For text data specifically both asymmetric and sy...

متن کامل

Fuzzy Hierarchical Location-Allocation Models for Congested Systems

There exist various service systems that have hierarchical structure. In hierarchical service networks, facilities at different levels provide different types of services. For example, in health care systems, general centers provide low-level services such as primary health care services, while the specialized hospitals provide high-level services. Because of demand congestion in service networ...

متن کامل

Determination of the Best Hierarchical Clustering Method for Regional Analysis of Base Flow Index in Kerman Province Catchments

The lack of complete coverage of hydrological data forces hydrologists to use the homogenization methods in regional analysis. In this research, in order to choose the best Hierarchical clustering method for regional analysis, base flow and related index were extracted from daily stream flow data using two parameter recursive digital filters in 43 hydrometric stations of the Kerman province. Ph...

متن کامل

Learning Multiple Hierarchical Relational Clusterings

Three important generalizations of the basic clustering problem are relational, hierarchical, and multiple clustering. This paper proposes the first approach to clustering that unifies all three. We describe a general probabilistic model for relational clustering, and show that flat, hierarchical and multiple relational clustering models are special cases. This paper also describes an efficient...

متن کامل

Running head: PROBABILISTIC CLUSTERING THEORY OF VSTM 1 A Probabilistic Clustering Theory of the Organization of Visual Short-Term Memory

Experimental evidence suggests that the content of a memory for even a simple display encoded in visual short-term memory (VSTM) can be very complex. VSTM uses organizational processes that make the representation of an item dependent on the feature values of all displayed items as well as on these items’ representations. Here, we develop a Probabilistic Clustering Theory (PCT) for modeling the...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1999